Incorporating Spatial Similarity into Ensemble Clustering
Abstract
This paper addresses a fundamental problem in ensemble clustering: how should one compare the similarity of two clusterings? The vast majority of prior techniques for comparing clusterings are entirely partitional, i.e., they examine the assignments of points in set-theoretic terms after the points have been partitioned. In doing so, these methods ignore the spatial layout of the data, disregarding the fact that this information is responsible for generating the clusterings to begin with. In this paper, we demonstrate the importance of incorporating spatial information when forming ensemble clusterings. We investigate the use of a recently proposed measure, called CDistance, which uses both spatial and partitional information to compare clusterings. We demonstrate that CDistance can be applied in a well-motivated way to four areas fundamental to existing ensemble techniques: the correspondence problem, subsampling, stability analysis, and diversity detection.
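To make the term "partitional" concrete, here is a minimal sketch of one such measure. The Rand index is chosen purely for illustration and is not named in the abstract; it is not CDistance, whose definition is not given here. The function name, the toy coordinates, and the labels are all hypothetical. The point to notice is that the score is computed from labels alone, so the coordinates never enter the calculation.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings place the two
    points together, or both place them apart. Only labels are used.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two points")
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical 1-D layouts: same labels, very different geometry.
points_near = [0.0, 1.0, 2.0, 3.0, 4.0]
points_far = [0.0, 1.0, 2.0, 300.0, 400.0]

clustering_a = [0, 0, 0, 1, 1]
clustering_b = [0, 0, 1, 1, 1]

# The coordinates are never passed in, so the score is the same for
# points_near and points_far, even though moving the third point to the
# other cluster is a much larger change in the second layout.
score = rand_index(clustering_a, clustering_b)
print(score)
```

Because neither `points_near` nor `points_far` is an argument, a measure of this kind cannot distinguish the two layouts, which is the limitation the abstract attributes to purely partitional comparisons.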
Similar resources
Toward Multi-Diversified Ensemble Clustering of High-Dimensional Data
The emergence of high-dimensional data in various areas has brought new challenges to the ensemble clustering research. To deal with the curse of dimensionality, considerable efforts in ensemble clustering have been made by incorporating various subspace-based techniques. Besides the emphasis on subspaces, rather limited attention has been paid to the potential diversity in similarity/dissimila...
Spectral Clustering Ensemble and Unsupervised Clustering for Land cover Identification in High Spatial Resolution Satellite Images
Unsupervised clustering plays a dominant role in detailed land-cover identification, specifically in agricultural and environmental monitoring of high spatial resolution remote sensing images. A method called Approximate Spectral Clustering enables spectral partitioning of big datasets to extract clusters with different characteristics without a parametric model. Various information types are use...
From Subspaces to Metrics and Beyond: Toward Multi-Diversified Ensemble Clustering of High-Dimensional Data
The emergence of high-dimensional data in various areas has brought new challenges to the ensemble clustering research. To deal with the curse of dimensionality, considerable efforts in ensemble clustering have been made by incorporating various subspace-based techniques. Besides the emphasis on subspaces, rather limited attention has been paid to the potential diversity in similarity/dissimila...
Weighted Ensemble Clustering for Increasing the Accuracy of the Final Clustering
Clustering algorithms are highly dependent on factors such as the number of clusters, the specific clustering algorithm, and the distance measure used. Inspired by ensemble classification, one approach to reducing the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in th...
Improving Accuracy in Intrusion Detection Systems Using Classifier Ensemble and Clustering
Recently, with the development of technology, the number of network-based services is increasing, and sensitive information of users is shared through the Internet. Accordingly, large-scale malicious attacks on computer networks could cause severe disruption to network services, so cybersecurity has become a major concern for networks. An intrusion detection system (IDS) could be cons...
Publication date: 2010